Abstract

A version of the secretary problem called the duration problem, in which the objective is to maximize
the time of possession of a relatively best or relatively second best object, is treated. It is shown that in
this duration problem there are threshold numbers $(k_1^\star, k_2^\star)$ such that the optimal strategy immediately
selects a relatively best object if it appears after time $k_1^\star$ and a relatively second best object if it appears
after moment $k_2^\star$. When the number of objects $N$ tends to infinity, the threshold values are $\lfloor 0.120381N \rfloor$ and
$\lfloor 0.417188N \rfloor$, respectively. The asymptotic mean shelf life of the object is $0.403827N$.

Keywords: Optimal stopping; Relative ranks; Best-choice problem; duration problem; Dynamic 
programming 
1 Introduction and summary

The duration problem for the classical no-information secretary problem was first investigated by
Ferguson, Hardwick and Tamaki [5]. It is a sequential selection problem which is a variation of
the classical secretary problem (CSP), treated, for example, by Gilbert and Mosteller [10]. In the CSP,
items ranked from 1 to $N$ are examined one at a time, drawn at random without replacement, and the aim
is to stop at an item whose overall (absolute) rank belongs to a given set of ranks (in the basic
version this set contains the rank 1 only), given only the relative ranks of the items drawn so far. Since
the articles by Gardner [8], [9], the secretary problem has been extended and generalized in many different
directions. Excellent reviews of the development of this colourful problem and its extensions have been 
given by Rose [TS], Freeman [7], Samuels [IS] and Ferguson [3]. 

The basic form of the duration problem can be described as follows. A set of N rankable objects appears 
as in CSP. As each object appears, we must decide to select or reject it based on the relative ranks of the 
objects. The payoff is the length of time we are in possession of a relatively best object. Thus we will only 
select a relatively best object, receiving a payoff of one as we do so and an additional one for each new 
observation as long as the selected object remains relatively best. 

Though Ferguson, Hardwick and Tamaki [5] considered various duration models extensively, they
confined themselves to the study of the shelf life of the relatively best items. In his seminal paper Gnedin [11]
has shown that such problems are equivalent to the analogous best-choice problems with random horizon
N, uniformly distributed from 1 to n.

In this paper our goal is different. We attempt to extend the problem to choose and keep the best or the 
second best items. For simplicity we refer to a relatively best or a second best object as a candidate. We 
receive a unit payoff at each step as long as the chosen object remains a candidate. Obviously only
candidates can be chosen, the objective being to maximize the expected payoff. This problem can be viewed
from another perspective as follows. Suppose that at moment $i$ we observe a relatively second best item, and
denote by $T(i)$ the time of the first candidate after time $i$ (i.e. the relatively best or the second best item) if
there is one, and $N + 1$ if there is none. If at moment $i$ we observe a relatively best item, then $T(i)$ is the moment
when a new item appears which changes the relative rank of the $i$th item to a non-candidate rank. The time
$T(i) - i$ is called the duration of the candidate selected at time $i$. The objective is to find a stopping time $\tau^\star$
such that
where $\tau$ denotes the stopping time.

This problem is discussed in Section 2. A Markov chain model is formulated and the optimal
strategy is derived in Section 2.2. That section is based mainly on the suggestion from [2] and the
results of [TH] and [17]. It can be shown that there exists an optimal threshold stopping time which
immediately selects a best candidate if it appears after time $k_1^\star$ and immediately selects a second
best candidate if it appears after time $k_2^\star$. In Section 3 we investigate the asymptotics as $N \to \infty$:
$k_1^\star/N$ converges to $a = 0.120381$ and $k_2^\star/N$ to $b = 0.417188$. The asymptotic mean shelf life
of the relatively best or the second best object is $0.403827N$.
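The threshold rule and the duration payoff described above can be illustrated by a small Monte Carlo experiment. The sketch below is only an illustration under stated assumptions: rank 1 is taken to be the best item, the payoff is normalized by $N$ (so that it is comparable with the values reported in Table 1), and the function names `relative_rank`, `duration`, `play` and `estimate` are hypothetical helpers, not part of the paper.

```python
import random

def relative_rank(seq, t, i):
    """Rank of item i (1-indexed) among the first t items; the smallest value is best."""
    return sum(1 for x in seq[:t] if x <= seq[i - 1])

def duration(seq, i):
    """T(i) - i: number of steps item i stays relatively best or second best."""
    n = len(seq)
    for t in range(i + 1, n + 1):
        if relative_rank(seq, t, i) > 2:
            return t - i
    return n + 1 - i  # item i remains a candidate until the end

def play(seq, k1, k2):
    """Stop at the first relatively best item after k1 or second best item after k2."""
    n = len(seq)
    for k in range(1, n + 1):
        y = relative_rank(seq, k, k)
        if (y == 1 and k > k1) or (y == 2 and k > k2):
            return duration(seq, k) / n  # payoff normalized by N (assumption)
    return 0.0

def estimate(n, k1, k2, trials=20000, seed=0):
    """Monte Carlo estimate of the expected normalized duration."""
    rng = random.Random(seed)
    items = list(range(1, n + 1))
    total = 0.0
    for _ in range(trials):
        rng.shuffle(items)
        total += play(items, k1, k2)
    return total / trials

if __name__ == "__main__":
    print(estimate(10, 1, 4))
```

For $N = 10$ with thresholds 1 and 4 the estimate should be close to the first row of Table 1, up to sampling error.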



2 Markov model for the shelf life of the best and the second best

The models we consider here are so-called no-information models, in which the decision to select an object is
based only on the relative ranks of the objects observed so far. Let $\mathbb{S} = \{1, 2, \ldots, N\}$ be the set of ranks of
the items and $\{x_1, x_2, \ldots, x_N\}$ a permutation of them. We assume that all permutations are equally likely. If $X_k$ is
the rank of the $k$-th candidate, we define

$$Y_k = \#\{1 \le i \le k : X_i \le X_k\}.$$

The random variable $Y_k$ is called the relative rank of the $k$-th candidate with respect to the items investigated up to
moment $k$.

We observe sequentially the permutation of items from the set $\mathbb{S}$. The mathematical model of such an
experiment is the probability space $(\Omega, \mathcal{F}, \mathbf{P})$. The elementary events are permutations of the elements
of $\mathbb{S}$ and the probability measure $\mathbf{P}$ is the uniform distribution on $\Omega$. The observations of the random variables
$Y_k$, $k = 1, 2, \ldots, N$, generate the sequence of $\sigma$-fields $\mathcal{F}_k = \sigma\{Y_1, Y_2, \ldots, Y_k\}$, $k = 1, 2, \ldots, N$. The random
variables $Y_k$ are independent and $\mathbf{P}\{Y_k = i\} = \frac{1}{k}$ for $i = 1, \ldots, k$.
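The independence and uniformity of the relative ranks can be checked by brute force for a small horizon. The following sketch (with the hypothetical helper names `relative_ranks` and `rank_vectors`) enumerates all permutations and counts how often each vector $(Y_1, \ldots, Y_N)$ occurs; it assumes that a smaller value means a better item.

```python
from itertools import permutations
from collections import Counter

def relative_ranks(perm):
    """Vector (Y_1, ..., Y_N) of relative ranks of a permutation of absolute ranks."""
    return tuple(
        sum(1 for x in perm[:k] if x <= perm[k - 1])
        for k in range(1, len(perm) + 1)
    )

def rank_vectors(n):
    """Count how many permutations of 1..n produce each relative-rank vector."""
    return Counter(relative_ranks(p) for p in permutations(range(1, n + 1)))

if __name__ == "__main__":
    counts = rank_vectors(4)
    # Every admissible vector with 1 <= Y_k <= k appears exactly once.
    print(len(counts), set(counts.values()))
```

Since there are $1 \cdot 2 \cdots N = N!$ admissible vectors and $N!$ equally likely permutations, each vector occurring exactly once means that the $Y_k$ are independent and uniform on $\{1, \ldots, k\}$.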

Denote by $\mathfrak{M}^N$ the set of all Markov moments $\tau$ with respect to the $\sigma$-fields $\{\mathcal{F}_k\}_{k=1}^{N}$. The decision maker
observes the stream of relative ranks. When $Y_i \in A = \{1, 2\}$, the $i$-th item is a potential candidate for the absolutely
first or second item. Sometimes it is enough to keep such a candidate for some period to get a profit which
is proportional to the shelf life of the candidate. The random variable $T_i$ is defined as the moment when the
kept candidate ceases to be a candidate. It can be described by $Y_s$ for $s = i, i + 1, \ldots, T_i$.

2.1 Distribution of $T_i$

There are two cases:

$Y_i = 2$: in this case $T_i = k$ when $Y_i = 2, Y_{i+1} > 2, Y_{i+2} > 2, \ldots, Y_{k-1} > 2, Y_k \in A$. We have for $i < k \le N$

$$\mathbf{P}\{T_i = k \mid Y_i = 2\} = \mathbf{P}\{Y_i = 2, Y_{i+1} > 2, Y_{i+2} > 2, \ldots, Y_{k-1} > 2, Y_k \in A \mid Y_i = 2\}
= \frac{i-1}{i+1} \cdot \frac{i}{i+2} \cdots \frac{k-3}{k-1} \cdot \frac{2}{k} = \frac{2(i-1)i}{(k-2)(k-1)k} \tag{2a}$$

and

$$\mathbf{P}\{T_i = N+1 \mid Y_i = 2\} = \mathbf{P}\{Y_i = 2, Y_{i+1} > 2, Y_{i+2} > 2, \ldots, Y_{N-1} > 2, Y_N > 2 \mid Y_i = 2\}
= 1 - \sum_{s=i+1}^{N} \frac{2(i-1)i}{(s-2)(s-1)s} = \frac{(i-1)i}{N(N-1)}. \tag{2b}$$

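$Y_i = 1$: the random variable $T_i = k$ if there exists $s \in \{i+1, \ldots, k-1\}$ such that $Y_i = 1, Y_{i+1} > 1, Y_{i+2} > 1,
\ldots, Y_{s-1} > 1, Y_s = 1, Y_{s+1} > 2, \ldots, Y_{k-1} > 2, Y_k \in A$. We have for $i < k \le N$

$$\begin{aligned}
\mathbf{P}\{T_i = k \mid Y_i = 1\} &= \mathbf{P}\Big\{\bigcup_{s=i+1}^{k-1} \{Y_i = 1, Y_{i+1} > 1, Y_{i+2} > 1, \ldots, Y_{s-1} > 1, Y_s = 1, \\
&\qquad\qquad Y_{s+1} > 2, \ldots, Y_{k-1} > 2, Y_k \in A\} \,\Big|\, Y_i = 1\Big\} \\
&= \sum_{s=i+1}^{k-1} \frac{i}{(s-1)s} \cdot \frac{2(s-1)s}{(k-2)(k-1)k} = \frac{2i(k-i-1)}{(k-2)(k-1)k}
\end{aligned} \tag{3a}$$

and

$$\begin{aligned}
\mathbf{P}\{T_i = N+1 \mid Y_i = 1\} &= 1 - \sum_{k=i+1}^{N} \mathbf{P}\{T_i = k \mid Y_i = 1\} = 1 - \sum_{k=i+1}^{N} \frac{2i(k-i-1)}{(k-2)(k-1)k} \\
&= 1 - \frac{(N-i)(N-i-1)}{N(N-1)} = \frac{2Ni - i^2 - i}{N(N-1)}.
\end{aligned} \tag{3b}$$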
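The distributions (2a)-(3b) can be compared with exact frequencies obtained by enumerating all permutations for a small horizon. The sketch below is a sanity check under the assumption that rank 1 denotes the best item; the helper names `exit_time`, `empirical` and `formula` are hypothetical, and the check is restricted to $i \ge 2$ so that no denominator vanishes.

```python
from fractions import Fraction
from itertools import permutations
from collections import Counter

def rank_among(perm, t, i):
    """Relative rank of item i (1-indexed) among the first t items."""
    return sum(1 for x in perm[:t] if x <= perm[i - 1])

def exit_time(perm, i):
    """T_i: first moment at which item i is no longer best or second best (N+1 if never)."""
    n = len(perm)
    for t in range(i + 1, n + 1):
        if rank_among(perm, t, i) > 2:
            return t
    return n + 1

def empirical(n, i, r):
    """Exact conditional distribution of T_i given Y_i = r, by enumeration."""
    hits = [exit_time(p, i) for p in permutations(range(1, n + 1))
            if rank_among(p, i, i) == r]
    counts = Counter(hits)
    return {k: Fraction(counts[k], len(hits)) for k in range(i + 1, n + 2)}

def formula(n, i, r):
    """Distribution of T_i given Y_i = r according to (2a)-(3b), for i >= 2."""
    dist = {}
    for k in range(i + 1, n + 1):
        num = 2 * (i - 1) * i if r == 2 else 2 * i * (k - i - 1)
        dist[k] = Fraction(num, (k - 2) * (k - 1) * k)
    tail = (i - 1) * i if r == 2 else 2 * n * i - i * i - i
    dist[n + 1] = Fraction(tail, n * (n - 1))
    return dist

if __name__ == "__main__":
    n = 6
    for i in range(2, n):
        for r in (1, 2):
            print(i, r, empirical(n, i, r) == formula(n, i, r))
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a tolerance check.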
The solution of problem (1) will be obtained by transforming it into an optimal stopping problem for
the embedded Markov chain.



2.2 The optimal stopping problem for the embedded Markov chain 

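Let us observe that for any $\tau \in \mathfrak{M}^N$

$$\mathbf{E}\,\frac{T_\tau - \tau}{N} = \sum_{r=1}^{N} \int_{\{\tau = r\}} \frac{T_r - r}{N}\, d\mathbf{P}
= \sum_{r=1}^{N} \int_{\{\tau = r\}} \mathbf{E}\Big\{\frac{T_r - r}{N} \,\Big|\, Y_r\Big\}\, d\mathbf{P} = \mathbf{E}\,\varphi(\tau, Y_\tau).$$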

In the following lemma the function $\varphi(\cdot)$ is calculated. Its final form uses the digamma
function ($\psi$-function) $\psi_n(z)$ (see Abramowitz and Stegun [1], p. 260). For $n = 0$ we will use the notation
$\psi(z)$. This function is defined as the $n$th derivative of the logarithmic derivative of the Euler gamma function $\Gamma(z)$:

$$\psi_n(z) = \frac{d^{n+1}}{dz^{n+1}} \ln \Gamma(z) = \frac{d^n}{dz^n}\, \frac{\Gamma'(z)}{\Gamma(z)}.$$



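Lemma 2.1. The payoff function $\varphi(k, r)$ has the form

$$\varphi(k, r) = \begin{cases}
\frac{k}{N^2}\left(1 + k - N - 2N\psi(k) + 2N\psi(N)\right) & \text{for } r = 1, \\
\frac{k(N-k+1)}{N^2} & \text{for } r = 2, \\
0 & \text{otherwise.}
\end{cases} \tag{4}$$

Proof. Based on the distribution of the random variable $T_k$ and the equality $\psi(p + 1) - \psi(p) = \frac{1}{p}$ for
the digamma function we get

$$\begin{aligned}
\varphi(k, 1) &= \mathbf{E}\Big\{\frac{T_k - k}{N} \,\Big|\, Y_k = 1\Big\} = \frac{1}{N}\left(\sum_{s=k+1}^{N+1} (s - k)\, \mathbf{P}\{T_k = s \mid Y_k = 1\}\right) \\
&= \frac{1}{N}\left(\sum_{s=k+1}^{N} \frac{2k(s-k-1)(s-k)}{s(s-1)(s-2)} + (N + 1 - k)\, \frac{2Nk - k^2 - k}{N(N-1)}\right) \\
&= \frac{k}{N^2}\left(1 + k - N - 2N(\psi(k) - \psi(N))\right),
\end{aligned} \tag{5a}$$

$$\begin{aligned}
\varphi(k, 2) &= \mathbf{E}\Big\{\frac{T_k - k}{N} \,\Big|\, Y_k = 2\Big\} = \frac{1}{N}\left(\sum_{s=k+1}^{N+1} (s - k)\, \mathbf{P}\{T_k = s \mid Y_k = 2\}\right) \\
&= \frac{1}{N}\left(\sum_{s=k+1}^{N} \frac{2k(k-1)(s-k)}{s(s-1)(s-2)} + (N + 1 - k)\, \frac{k(k-1)}{N(N-1)}\right) = \frac{k(N-k+1)}{N^2}.
\end{aligned} \tag{5b}$$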
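The closed forms of Lemma 2.1 can be cross-checked against the defining sums. The following sketch evaluates both sides in exact rational arithmetic; it assumes the normalization by $N$ used in (4), uses the identity $\psi(N) - \psi(k) = \sum_{j=k}^{N-1} \frac{1}{j}$ for integer arguments, and the names `p_exit`, `phi_direct` and `phi_closed` are hypothetical.

```python
from fractions import Fraction

def p_exit(n, k, s, r):
    """P{T_k = s | Y_k = r} as in (2a)-(3b); s = n + 1 means the item never exits.
    For r = 2 the caller must use k >= 2 (rank 2 is impossible at k = 1)."""
    if s == n + 1:
        tail = (k - 1) * k if r == 2 else 2 * n * k - k * k - k
        return Fraction(tail, n * (n - 1))
    if r == 2:
        return Fraction(2 * (k - 1) * k, (s - 2) * (s - 1) * s)
    if s == k + 1:
        return Fraction(0)  # a relatively best item cannot drop out in one step
    return Fraction(2 * k * (s - k - 1), (s - 2) * (s - 1) * s)

def phi_direct(n, k, r):
    """E[(T_k - k)/N | Y_k = r] computed directly from the distribution of T_k."""
    return sum((s - k) * p_exit(n, k, s, r) for s in range(k + 1, n + 2)) / n

def phi_closed(n, k, r):
    """Closed form (4) of Lemma 2.1 for r in {1, 2}."""
    if r == 2:
        return Fraction(k * (n - k + 1), n * n)
    psi_diff = sum(Fraction(1, j) for j in range(k, n))  # psi(N) - psi(k)
    return Fraction(k, n * n) * (1 + k - n + 2 * n * psi_diff)

if __name__ == "__main__":
    n = 10
    print(all(phi_direct(n, k, 1) == phi_closed(n, k, 1) for k in range(1, n + 1)))
    print(all(phi_direct(n, k, 2) == phi_closed(n, k, 2) for k in range(2, n + 1)))
```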
2.2.1 Recursive algorithm

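Let $\mathfrak{M}_r^N = \{\tau \in \mathfrak{M}^N : r \le \tau \le N\}$ and $w_N(r) = \sup_{\tau \in \mathfrak{M}_r^N} \mathbf{E}\varphi(\tau, Y_\tau)$. The following algorithm allows us to
construct the value of the problem $v_N = w_N(1, 1)$.

$$w_N(N) = \mathbf{E}\varphi(N, Y_N) = \frac{2}{N^2}. \tag{6}$$

Let

$$w_N(N, r) = \begin{cases} \frac{1}{N}, & \text{if } r \in A, \\ 0, & \text{otherwise,} \end{cases} \tag{7a}$$

$$w_N(k, r) = \max\{\varphi(k, r), \mathbf{E} w_N(k + 1, Y_{k+1})\}, \tag{7b}$$

$$w_N(k) = \mathbf{E} w_N(k, Y_k) = \frac{1}{k} \sum_{r=1}^{k} w_N(k, r). \tag{7c}$$

We then have $v_N = w_N(1)$. The optimal stopping time $\tau^\star$ is defined as follows: one has to stop at the first
moment $k$ when $Y_k = r$, unless $w_N(k, r) > \varphi(k, r)$. We can define the stopping set $\Gamma = \{(k, r) : \varphi(k, r) \ge
w_N(k + 1)\}$.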
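The backward induction (6)-(7c) is straightforward to run numerically. The sketch below is one possible implementation, not the authors' code: it uses the payoff function of Lemma 2.1 with floating point harmonic sums for $\psi(N) - \psi(k)$, and it reports each threshold as the last moment at which continuing is strictly better than stopping on the corresponding relative rank (an assumption about how the decision points are read off). The function name `solve` is hypothetical.

```python
def solve(n):
    """Backward induction for horizon n; returns (k1, k2, v_n)."""
    # harm[k] = psi(n) - psi(k) = sum_{j=k}^{n-1} 1/j
    harm = [0.0] * (n + 2)
    for k in range(n - 1, 0, -1):
        harm[k] = harm[k + 1] + 1.0 / k

    def phi(k, r):
        """Payoff function of Lemma 2.1 (normalized by n)."""
        if r == 1:
            return k * (1 + k - n + 2 * n * harm[k]) / n ** 2
        return k * (n - k + 1) / n ** 2

    cont = 0.0                  # value of continuing past moment k
    last_wait = {1: 0, 2: 0}    # last moment where waiting beats stopping
    for k in range(n, 0, -1):
        ranks = (1, 2) if k >= 2 else (1,)
        total = 0.0
        for r in ranks:
            gain = phi(k, r)
            if gain < cont:
                last_wait[r] = max(last_wait[r], k)
            total += max(gain, cont)
        total += (k - len(ranks)) * cont   # non-candidates: always continue
        cont = total / k                   # this is w_n(k)
    return last_wait[1], last_wait[2], cont

if __name__ == "__main__":
    for n in (10, 20, 50, 100):
        print(n, solve(n))
```

For $N = 10$ this recursion gives the thresholds 1 and 4 and a value of about 0.527526, in agreement with Table 1.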
2.2.2 Embedded Markov chain

Let $a = \max(A)$. The function $\varphi(k, r)$ defined in (4) is equal to 0 for $r > a$ and non-negative for $r \le a$.
It means that it is rational to choose an item for keeping at moment $k$ only when the state $(k, r)$ is such that $r \le a$.
Define $W_0 = (1, Y_1) = (1, 1)$, $\gamma_t = \inf\{r > \gamma_{t-1} : Y_r \le \min(a, r)\}$ ($\inf \emptyset = \infty$) and $W_t = (\gamma_t, Y_{\gamma_t})$. If $\gamma_t = \infty$
then define $W_t = (\infty, \infty)$. $W_t$ is a Markov chain with the state space $\mathbb{E} = \{(s, r) : s \in \{1, 2, \ldots, N\}, r \in
A\} \cup \{(\infty, \infty)\}$ and the following one-step transition probabilities (see [17])

$$p(r, s) = \mathbf{P}\{W_{t+1} = (s, l_s) \mid W_t = (r, l_r)\} = \begin{cases}
\frac{1}{s}, & \text{if } r < a,\ s = r + 1, \\
\frac{(r)_a}{(s)_{a+1}}, & \text{if } a \le r < s, \\
0, & \text{if } r \ge s \text{ or } r < a,\ s \ne r + 1,
\end{cases} \tag{8}$$

with $p(\infty, \infty) = 1$, $p(r, \infty) = 1 - a \sum_{s=r+1}^{N} p(r, s)$, where $(s)_a = s(s-1)(s-2) \cdots (s-a+1)$, $(s)_0 = 1$. We
denote by $T\varphi(k, r) = \mathbf{E}_{(k,r)} \varphi(W_1)$ the mean operator for a function $\varphi : \mathbb{E} \to \Re$. Let $\mathcal{G}_t = \sigma\{W_1, W_2, \ldots, W_t\}$
and let $\widetilde{\mathfrak{M}}^N$ be the set of stopping times with respect to $\{\mathcal{G}_t\}_{t=1}^{N}$. Since $\gamma_t$ is increasing, we can define
$\widetilde{\mathfrak{M}}_{r+1}^N = \{\sigma \in \widetilde{\mathfrak{M}}^N : \gamma_\sigma > r\}$.

Let $\mathbf{P}_{(k,r)}(\cdot)$ be the probability measure related to the Markov chain $W_t$ with trajectory starting in state
$(k, r)$, and $\mathbf{E}_{(k,r)}(\cdot)$ the expected value with respect to $\mathbf{P}_{(k,r)}(\cdot)$. From (8) we can see that the transition
probabilities do not depend on relative ranks, but only on the moments $k$ where items with relative rank
$r \le \min(a, k)$ appear. Based on the following lemma we can solve problem (1) with gain function (4)
using the embedded Markov chain $\{W_t\}$.

Lemma 2.2. (see fT^) 

$$\mathbf{E} w_N(k + 1, Y_{k+1}) = \mathbf{E}_{(k,r)} w_N(W_1) \quad \text{for every } r \le \min(a, k). \tag{9}$$

2.2.3 Solution of the optimal shelf life problem 

First of all the form of $T\varphi(k, r)$ for $(k, r) \in \mathbb{E}$ will be given.

Lemma 2.3. The expected payoff of the function $\varphi(\cdot)$ with respect to the embedded Markov chain $(W_t, \mathcal{G}_t, \mathbf{P}_{(1,1)})_{t=0}^{N}$
has the following form:

Proof. The definition of the embedded Markov chain (8) and the payoff function $\varphi(\cdot)$ in Lemma 2.1
give

$$\begin{aligned}
T\varphi(k, r) &= \sum_{j=k+1}^{N} \sum_{l=1}^{2} p(k, j)\, \varphi(j, l) \\
&= \sum_{j=k+1}^{N} \frac{k(k-1)}{j(j-1)(j-2)} \left( \frac{j\left(2N(\psi(N) - \psi(j)) + 1 + j - N\right)}{N^2} + \frac{j(N - j + 1)}{N^2} \right).
\end{aligned}$$
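The sum appearing in this proof can be evaluated exactly for small $N$. In the sketch below, `t_phi_sum` follows the definition through the transition probabilities (8) with $a = 2$ and the payoffs of Lemma 2.1, while `t_phi_closed` is a closed form obtained here by summation by parts; it is our own simplification used only as a cross-check and is not quoted from the paper. Both names, and the helper `phi`, are hypothetical.

```python
from fractions import Fraction

def phi(n, k, r):
    """Payoff function of Lemma 2.1 in exact arithmetic."""
    if r == 2:
        return Fraction(k * (n - k + 1), n * n)
    psi_diff = sum(Fraction(1, j) for j in range(k, n))  # psi(N) - psi(k)
    return Fraction(k, n * n) * (1 + k - n + 2 * n * psi_diff)

def t_phi_sum(n, k):
    """Sum of p(k, j) * phi(j, l) over the next candidate states, for k >= 2."""
    return sum(
        Fraction(k * (k - 1), j * (j - 1) * (j - 2)) * (phi(n, j, 1) + phi(n, j, 2))
        for j in range(k + 1, n + 1)
    )

def t_phi_closed(n, k):
    """Candidate closed form (2k/N^2)(k - N + N(psi(N) - psi(k))); an assumption checked below."""
    psi_diff = sum(Fraction(1, j) for j in range(k, n))
    return Fraction(2 * k, n * n) * (k - n + n * psi_diff)

if __name__ == "__main__":
    n = 12
    print(all(t_phi_sum(n, k) == t_phi_closed(n, k) for k in range(2, n + 1)))
```

With $k/N \to x$ this closed form tends to $2(x^2 - x - x\log x)$, which is the limit used in Section 3.2.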






Table 1: Decision points and values of the problem

N       k_1*            k_2*            v_N
10      1               4               0.527526
20      2               8               0.464357
30      3               12              0.442977
40      4               16              0.432325
50      6               21              0.426411
60      7               25              0.422846
70      8               29              0.420142
80      9               33              0.418024
90      10              37              0.416322
100     12              41              0.415064
200     24              83              0.409431
500     60              208             0.406064
1000    120             417             0.404944
∞       [0.120381N]     [0.417188N]     0.403827



Let us denote $A_k(r) = \{(s, r) : s > k\}$.

Theorem 2.1. There are constants $k_1^\star$ and $k_2^\star$ such that the optimal stopping time for problem (1) has
the form

$$\tau^\star = \inf\{t : W_t \in A_{k_1^\star}(1) \cup A_{k_2^\star}(2)\}.$$

The value function



Vpf{kl,k*) 



{N{3N - 4) - 3) + k^N - 3)^P{kt) + k^ {2{N^ - 1) (^i(fc^ + 1) - + 1))) 

{N ~1)N 
fcj' (2(jV - l)^(jV) + (5 - 3jV)^(fc;)) 
{N - l)N 

k^ (37V3 (2/c5 - 3)iV2 - 2 {kf + k^ + 2) N + kf + fcj) 
{N-l)N'^k* 

Proof. The payoff functions $\varphi(\cdot, r)$ for $r \in A$ are unimodal. This can be seen by analysing the differences
$\varphi(k + 1, 1) - \varphi(k, 1)$, which are decreasing for $k < N - 1$. Comparing the events related to $T_k = j$ on $\{Y_k = 1\}$
and $\{Y_k = 2\}$ leads to the conclusion that $\varphi(k, 1) \ge \varphi(k, 2)$ for $k \in \{1, 2, \ldots, N\}$. The value function $w_N(k)$ is
nonincreasing, because the set of stopping times $\mathfrak{M}_k^N$ shrinks as $k$ grows. At $k = N - 1$ both payoff functions
are greater than $w_N(N - 1)$. Let $k_2^\star = \inf\{1 \le k \le N : T\varphi(k, 1) \le \varphi(k, 2)\} - 1$. We have for $k > k_2^\star$
and $r = 1, 2$ that $w_N(k, r) = \varphi(k, r)$ and $w_N(k) = T\varphi(k, r)$. Let us denote $k_1^\star = \inf\{1 \le k \le k_2^\star : w_N(k) \le
\varphi(k, 1)\}$, where $w_N(k) = v_N(k, k_2^\star)$ and for $k < s$ we have

k k 
VN{k,s) = V -7^. -Tip{j,l) + -wn{s) 

7(1 — 1) S 



{N{3N - 4) - 3) + k{N - 3)V>(fc) + k {2{N - l)ipiN) + (5 - 3jV)^(s)) 

{N-1)N 

fc(2(jV^-l)(V>i(g + l)-V;i(fc + l))) 
{N -1)N 

k {3N^ + (2s - 'S)N^ - 2 (§2 + s + 2) TV + s2 + s) 
(AT - l)iV2s ■ 



Numerical examples of the solution of the shelf life problem for different horizons are given in
Table 1.

3 Asymptotic duration problem 

3.1 The limit of the finite horizon problem

Let the number of candidates go to infinity. For such a large number of candidates we can find the optimal
solution of (1) based on the following consideration. As $N \to \infty$ such that $\frac{k}{N} \to x \in (0, 1]$, the embedded
Markov chain $(W_t, \mathcal{F}_t, \mathbf{P}_{(1,1)})$ with state space $\mathbb{E} = \{1, 2, \ldots, N\} \times \{1, 2, \ldots, \max(A)\}$ can be treated as a
Markov chain $(W'_t, \mathcal{F}_t, \mathbf{P}_{(x,l)})$ on $\{\frac{1}{N}, \frac{2}{N}, \ldots, 1\} \times \{1, 2, \ldots, \max(A)\}$.

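Lemma 3.1. The gain function $\varphi([Nx], r)$ has the limit

$$\varphi(x, r) = \lim_{N \to \infty} \varphi([Nx], r) = \begin{cases}
x^2 - 2\log(x)\,x - x & \text{for } r = 1, \\
x(1 - x) & \text{for } r = 2.
\end{cases}$$

Proof. Let us pass to the limit in the formula for $\varphi(k, r)$ given in (5a). We get

$$\begin{aligned}
\varphi(x, 1) &= \lim_{N \to \infty,\ k/N \to x} \varphi(k, 1) \\
&= \lim_{N \to \infty,\ k/N \to x} \frac{1}{N}\left(\sum_{s=k+1}^{N} \frac{2k(s-k-1)(s-k)}{s(s-1)(s-2)} + (N + 1 - k)\, \frac{2Nk - k^2 - k}{N(N-1)}\right) \\
&= \int_x^1 \frac{2x(z - x)(z - x)}{z^3}\, dz + (1 - x)(2x - x^2) \\
&= -2x\log(x) - x + x^2
\end{aligned} \tag{11a}$$

and

$$\varphi(x, 2) = \lim_{N \to \infty,\ k/N \to x} \varphi(k, 2) = \lim_{N \to \infty,\ k/N \to x} \frac{k(N - k + 1)}{N^2} = x(1 - x). \tag{11b}$$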
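The limits (11a) and (11b) can be checked numerically by evaluating the finite-horizon payoff of Lemma 2.1 for a large $N$. The sketch below is only a numerical illustration; `phi` and `phi_limit` are hypothetical helper names, and $\psi(N) - \psi(k)$ is computed as the harmonic sum $\sum_{j=k}^{N-1} \frac{1}{j}$.

```python
import math

def phi(n, k, r):
    """Finite-horizon payoff function of Lemma 2.1 (floating point)."""
    if r == 2:
        return k * (n - k + 1) / n ** 2
    psi_diff = sum(1.0 / j for j in range(k, n))  # psi(N) - psi(k)
    return k * (1 + k - n + 2 * n * psi_diff) / n ** 2

def phi_limit(x, r):
    """Limiting gain function of Lemma 3.1."""
    if r == 2:
        return x * (1 - x)
    return x * x - 2 * x * math.log(x) - x

if __name__ == "__main__":
    n = 100000
    for x in (0.1, 0.4, 0.9):
        k = int(n * x)
        print(x, phi(n, k, 1), phi_limit(x, 1), phi(n, k, 2), phi_limit(x, 2))
```

The discrepancy is of order $1/N$, so for $N = 10^5$ the two columns agree to several decimal places.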
We also get $\lim_{N \to \infty} \mathbf{E}_{(k,r)} \varphi(W_1) = \mathbf{E}_{(x,r)} \varphi(W'_1)$, where $(W'_t, \mathcal{F}_t, \mathbf{P}_{(x,l)})$ is a Markov chain with the

state space $(0, 1] \times \{1, 2, \ldots, \max(A)\}$ and the transition density function



The expected value with respect to the conditional distribution given in (12) is the following



— 1 Jx 



The recursive formulae (6)-(7c) in the asymptotic case have the form

$$v(1) = 0, \tag{14a}$$

$$w(x, r) = \max\{\varphi(x, r), \mathbf{E}_{(x,r)} w(W'_1)\}, \tag{14b}$$

$$v(x) = \mathbf{E}_{(x,r)} w(W'_1). \tag{14c}$$

The function $w(x, r)$ is the limit of $w_N(k, r)$ when $\frac{k}{N} \to x \in (0, 1]$, i.e. $\lim_{N \to \infty} w_N([Nx], r) = w(x, r)$.



The asymptotic solution is obtained by the 'recursive' method based on (14a)-(14c).



3.2 Asymptotic solution of shelf life problem 

It is of interest to investigate the asymptotic behaviour of $k_1^\star$, $k_2^\star$ and $v_N$ as $N$ tends to infinity. The
algorithm presented in Section 3.1 is used. Based on Lemma 2.3 we get



Lemma 3.2. $\lim_{N \to \infty,\ k/N \to x} T\varphi(k, r) = 2(x^2 - x - x\log(x))$ and

$$b = \lim_{N \to \infty} \frac{k_2^\star}{N} = -\frac{2}{3}\, W\!\left(-\frac{3}{2} \exp\left(-\frac{3}{2}\right)\right) \approx 0.417188, \tag{15}$$

where $W(\cdot)$ is the Lambert W-function¹ (cf. Pólya and Szegő [13]).

Proof. From Lemma 2.3 we easily get the limit of $T\varphi(k, r)$. It allows us to formulate the equation
which $b = \lim_{N \to \infty} \frac{k_2^\star}{N}$ should fulfil. It is

$$-2(x\log(x) + x - x^2) = x(1 - x).$$

After simple algebra we get that $b$ should fulfil the equation $-\frac{3}{2}\exp(-\frac{3}{2}) = -\frac{3}{2} b \exp(-\frac{3}{2} b)$. The inverse
function to the function $h(w) = w e^w$ is the Lambert W-function. It gives the solution (15).
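The constants quoted in the abstract can be reproduced numerically from the limiting payoffs. The sketch below is an illustration under explicit assumptions, not the authors' derivation: $b$ is found by bisection from $T\varphi(x) = \varphi(x, 2)$; for $x < b$ the value of waiting is taken to be $\int_x^b \frac{x}{z^2}\varphi(z, 1)\,dz + \frac{x}{b} T\varphi(b)$ (accept the next relatively best item before $b$, any candidate after $b$), which we integrated in closed form ourselves; and $a$ solves $\varphi(a, 1) = $ that value. All function names are hypothetical.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def phi1(x):
    """Limiting payoff for a relatively best item (Lemma 3.1)."""
    return x * x - 2 * x * math.log(x) - x

def phi2(x):
    """Limiting payoff for a relatively second best item (Lemma 3.1)."""
    return x * (1 - x)

def t_phi(x):
    """Limit of T phi(k, r) from Lemma 3.2."""
    return 2 * (x * x - x - x * math.log(x))

def v_between(x, b):
    """Value of waiting at x < b (our own closed form of the integral; an assumption)."""
    lx, lb = math.log(x), math.log(b)
    return x * (1 - x + lx * lx + lx - lb * lb - lb)

b = bisect(lambda x: t_phi(x) - phi2(x), 0.05, 0.6)
a = bisect(lambda x: phi1(x) - v_between(x, b), 1e-6, b)
value = phi1(a)

if __name__ == "__main__":
    print(a, b, value)
```

The root $b$ also satisfies the Lambert-type identity $w e^{w} = -\frac{3}{2} e^{-3/2}$ with $w = -\frac{3}{2} b$, which is the content of (15).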



4 Final remarks 

Thus far we have implicitly assumed that the object, once chosen, is possessed until the process terminates.
It is possible to extend the solution to the multiple choice duration problem, similarly as in [19]. This will be the subject
of further research. It is also unknown to the authors whether the solution of this duration problem is similar to the
solution of the related best choice problem with a random number of objects available. Such a coincidence takes
place for the best candidate duration problem and the best choice problem with a random number of
available objects considered by Presman and Sonin [14]. This observation has been made in [S].

¹ This function was introduced by Euler [3] in relation to the Lambert transcendental function investigated by Lambert

m 